Fault-Tolerant Distributed-Shared-Memory on a Broadcast-Based Interconnection Network
Authors
Abstract
The Simultaneous Optical Multiprocessor Exchange Bus (SOME-Bus) is a low-latency, high-bandwidth interconnection network that directly links arbitrary pairs of processor nodes without contention and can efficiently interconnect more than one hundred nodes. Each node has a dedicated output channel and an array of receivers, with one receiver dedicated to every other node’s output channel. The SOME-Bus eliminates the need for global arbitration and provides bandwidth that scales directly with the number of nodes in the system. Under the distributed shared memory (DSM) paradigm, the SOME-Bus allows tight integration of the transmitter, receiver, and cache controller hardware, producing a system-wide cache coherence mechanism. Backward Error Recovery fault-tolerance techniques can rely on DSM data replication and SOME-Bus broadcasts, adding little network traffic and causing correspondingly little performance degradation. This paper uses extensive simulation to examine the performance of the SOME-Bus architecture under DSM and Backward Error Recovery.
Similar Resources
A comparison of Broadcast-based and Switch-based Networks of Workstations
Networks of Workstations have been mostly designed using switch-based architectures and programming based on message passing. This paper describes a network of workstations based on the Simultaneous Optical Multiprocessor Exchange Bus (SOME-Bus) which is a low-latency, high-bandwidth interconnection network that directly links arbitrary pairs of processor nodes without contention, and can effic...
Evaluation of Real-Time Fiber Communications for Parallel Collective Operations
Real-Time Fiber Communications (RTFC) is a gigabit speed network that has been designed for damage tolerant local area networks. In addition to its damage tolerant characteristics, it has several features that make it attractive as a possible interconnection technology for parallel applications in a cluster of workstations. These characteristics include support for broadcast and multicast messa...
Fault Tolerance and Performance of Multipath Multistage Interconnection Networks
In building a multiprocessor system, we can maximize the system's mean time to failure by providing an architecture resilient to component faults. We compare the fault tolerance and performance characteristics of various fault-tolerant multistage interconnection networks. We primarily focus on networks composed of dilated routing components. A dilated router features redundant outputs in each l...
Fault-tolerant Design for Multistage Routing Networks
André DeHon, Thomas Knight Jr., and Henry Minsky. International Symposium on Shared Memory Multiprocessing, 1991. As the size of digital systems increases, the average length of time between single component failures diminishes. To avoid component-related failures, large computers must be fault-tolerant; that is, the computer must perform ...